dynamic factor
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Texas (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
Barami, Tal, Berman, Nimrod, Naiman, Ilan, Hason, Amos H., Ezra, Rotem, Azencot, Omri
Learning disentangled representations in sequential data is a key goal in deep learning, with broad applications in vision, audio, and time series. While real-world data involves multiple interacting semantic factors over time, prior work has mostly focused on simpler two-factor static and dynamic settings, primarily because such settings make data collection easier, thereby overlooking the inherently multi-factor nature of real-world data. We introduce the first standardized benchmark for evaluating multi-factor sequential disentanglement across six diverse datasets spanning video, audio, and time series. Our benchmark includes modular tools for dataset integration, model development, and evaluation metrics tailored to multi-factor analysis. We additionally propose a post-hoc Latent Exploration Stage to automatically align latent dimensions with semantic factors, and introduce a Koopman-inspired model that achieves state-of-the-art results. Moreover, we show that Vision-Language Models can automate dataset annotation and serve as zero-shot disentanglement evaluators, removing the need for manual labels and human intervention. Together, these contributions provide a robust and scalable foundation for advancing multi-factor sequential disentanglement. Our code is available on GitHub, and the datasets and trained models are available on Hugging Face.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Israel (0.04)
- Media > Music (0.46)
- Leisure & Entertainment (0.46)
DiffSDA: Unsupervised Diffusion Sequential Disentanglement Across Modalities
Zisling, Hedi, Naiman, Ilan, Berman, Nimrod, Suwajanakorn, Supasorn, Azencot, Omri
Unsupervised representation learning, particularly sequential disentanglement, aims to separate static and dynamic factors of variation in data without relying on labels. This remains a challenging problem, as existing approaches based on variational autoencoders and generative adversarial networks often rely on multiple loss terms, complicating the optimization process. Furthermore, sequential disentanglement methods face challenges when applied to real-world data, and there is currently no established evaluation protocol for assessing their performance in such settings. Recently, diffusion models have emerged as state-of-the-art generative models, but no theoretical formalization exists for their application to sequential disentanglement. In this work, we introduce the Diffusion Sequential Disentanglement Autoencoder (DiffSDA), a novel, modal-agnostic framework effective across diverse real-world data modalities, including time series, video, and audio. DiffSDA leverages a new probabilistic modeling, latent diffusion, and efficient samplers, while incorporating a challenging evaluation protocol for rigorous testing. Our experiments on diverse real-world benchmarks demonstrate that DiffSDA outperforms recent state-of-the-art methods in sequential disentanglement.
Wind Power Scenario Generation based on the Generalized Dynamic Factor Model and Generative Adversarial Network
Cho, Young-ho, Zhu, Hao, Lee, Duehee, Baldick, Ross
For conducting resource adequacy studies, we synthesize multiple long-term wind power scenarios of distributed wind farms simultaneously by using the spatio-temporal features: spatial and temporal correlation, waveforms, marginal and ramp rates distributions of waveform, power spectral densities, and statistical characteristics. Generating the spatial correlation in scenarios requires the design of common factors for neighboring wind farms and antithetical factors for distant wind farms. The generalized dynamic factor model (GDFM) can extract the common factors through cross spectral density analysis, but it cannot closely imitate waveforms. The GAN can synthesize plausible samples representing the temporal correlation by verifying samples through a fake sample discriminator. To combine the advantages of GDFM and GAN, we use the GAN to provide a filter that extracts dynamic factors with temporal information from the observation data, and we then apply this filter in the GDFM to represent both spatial and frequency correlations of plausible waveforms. Numerical tests on the combination of GDFM and GAN have demonstrated performance improvements over competing alternatives in synthesizing wind power scenarios from Australia, better realizing plausible statistical characteristics of actual wind power compared to alternatives such as the GDFM with a filter synthesized from distributions of actual dynamic filters and the GAN with direct synthesis without dynamic factors.
Resource adequacy means to maintain power system reliability by having sufficient capacity such that, even with failures or variability of resources, the probability of not being able to meet all load is sufficiently small [1]. System operators achieve resource adequacy of a power system by ensuring there is enough generation capacity [2].
In the case of intermittent energy resources, the effective load carrying capacity (ELCC) of the intermittent resource is the equivalent capacity of highly reliable generators that would result in the same probability of not being able to meet all load [3]. For example, the ELCC of wind power can be obtained by simulating power systems with long-term wind power scenarios with realistic ramping rates and marginal distributions [4]. Furthermore, the capacity factor and reserve margin contribution of wind power to the power system reliability can also be obtained by simulating a future power system by using realistic long-term wind power scenarios [5].
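The ELCC definition above can be illustrated with a minimal sketch. This is not the authors' simulation procedure: it assumes a deliberately simplified reliability measure (the fraction of time steps in which available capacity is below load), and the function names `lolp` and `elcc`, the bisection search, and the toy numbers are all hypothetical choices made for illustration.

```python
import numpy as np

def lolp(available, load):
    """Fraction of time steps in which available capacity is below load."""
    return float(np.mean(np.asarray(available, dtype=float) < np.asarray(load, dtype=float)))

def elcc(base_capacity, wind, load, tol=1e-3):
    """Smallest firm capacity (within tol) whose shortfall probability
    does not exceed the one obtained by adding the wind series."""
    wind = np.asarray(wind, dtype=float)
    load = np.asarray(load, dtype=float)
    target = lolp(base_capacity + wind, load)
    lo, hi = 0.0, float(wind.max())
    # The shortfall probability is non-increasing in firm capacity,
    # and hi = max(wind) always meets the target, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lolp(base_capacity + mid, load) <= target:
            hi = mid
        else:
            lo = mid
    return hi

# Toy, made-up scenario: four time steps, constant load of 10 units,
# 8 units of conventional capacity, and a short wind power series.
toy_load = [10.0, 10.0, 10.0, 10.0]
toy_wind = [3.0, 1.0, 3.0, 3.0]
toy_elcc = elcc(8.0, toy_wind, toy_load)
```

In this toy case the wind series leaves one shortfall step out of four, and any firm capacity of at least 2 units removes all shortfalls, so the search settles at roughly 2 units. A realistic study would replace the toy series with long-term wind power scenarios and a full reliability simulation.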
FG-PE: Factor-graph Approach for Multi-robot Pursuit-Evasion
Esfahani, Messiah Abolfazli, Başar, Ayşe, Saeedi, Sajad
With the increasing use of robots in daily life, there is a growing need to provide robust collaboration protocols for robots to tackle more complicated and dynamic problems effectively. This paper presents a novel, factor graph-based approach to address the pursuit-evasion problem, enabling accurate estimation, planning, and tracking of an evader by multiple pursuers working together. It is assumed that there are multiple pursuers and only one evader in this scenario. The proposed method significantly improves the accuracy of evader estimation and tracking, allowing pursuers to capture the evader in the shortest possible time and distance compared to existing techniques. In addition to these primary objectives, the proposed approach effectively minimizes uncertainty while remaining robust, even when communication issues lead to some messages being dropped or lost. Through a series of comprehensive experiments, this paper demonstrates that the proposed algorithm consistently outperforms traditional pursuit-evasion methods across several key performance metrics, such as the time required to capture the evader and the average distance traveled by the pursuers. Additionally, the proposed method is tested in real-world hardware experiments, further validating its effectiveness and applicability.
- North America > Canada > Ontario > Toronto (0.04)
- Europe (0.04)
Sequential Representation Learning via Static-Dynamic Conditional Disentanglement
Simon, Mathieu Cyrille, Frossard, Pascal, De Vleeschouwer, Christophe
This paper explores self-supervised disentangled representation learning within sequential data, focusing on separating time-independent and time-varying factors in videos. We propose a new model that breaks the usual independence assumption between those factors by explicitly accounting for the causal relationship between the static/dynamic variables and that improves the model expressivity through additional Normalizing Flows. A formal definition of the factors is proposed. This formalism leads to the derivation of sufficient conditions for the ground truth factors to be identifiable, and to the introduction of a novel theoretically grounded disentanglement constraint that can be directly and efficiently incorporated into our new framework. The experiments show that the proposed approach outperforms previous complex state-of-the-art techniques in scenarios where the dynamics of a scene are influenced by its content.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > Belgium > Wallonia > Walloon Brabant > Louvain-la-Neuve (0.04)